59 research outputs found
In All Likelihood, Deep Belief Is Not Enough
Statistical models of natural stimuli provide an important tool for
researchers in the fields of machine learning and computational neuroscience. A
canonical way to quantitatively assess and compare the performance of
statistical models is given by the likelihood. One class of statistical models
which has recently gained increasing popularity and has been applied to a
variety of complex data are deep belief networks. Evaluations of these models,
however, have typically been limited to qualitative analyses of samples,
because the model likelihood is computationally intractable.
Motivated by these circumstances, the present article provides a consistent
estimator for the likelihood that is both computationally tractable and simple
to apply in practice. Using this estimator, a deep belief network which has
been suggested for the modeling of natural image patches is quantitatively
investigated and compared to other models of natural image patches. Contrary to
earlier claims based on qualitative results, the results presented in this
article provide evidence that the model under investigation is not a
particularly good model for natural image
HARD: Hard Augmentations for Robust Distillation
Knowledge distillation (KD) is a simple and successful method to transfer
knowledge from a teacher to a student model solely based on functional
activity. However, current KD has a few shortcomings: it has recently been
shown that this method is unsuitable for transferring simple inductive biases
like shift equivariance, struggles to transfer out-of-domain generalization,
and requires optimization times that are orders of magnitude longer than
default non-KD model training. To improve these aspects of KD, we propose Hard
Augmentations for
Robust Distillation (HARD), a generally applicable data augmentation framework,
that generates synthetic data points for which the teacher and the student
disagree. We show in a simple toy example that our augmentation framework
solves the problem of transferring simple equivariances with KD. We then apply
our framework in real-world tasks for a variety of augmentation models, ranging
from simple spatial transformations to unconstrained image manipulations with a
pretrained variational autoencoder. We find that our learned augmentations
significantly improve KD performance on in-domain and out-of-domain evaluation.
Moreover, our method outperforms even state-of-the-art data augmentations, and
since the augmented training inputs can be visualized, they offer qualitative
insight into the properties that are transferred from the teacher to the
student. Thus, HARD represents a generally applicable, dynamically optimized
data augmentation technique tailored to improve the generalization and
convergence speed of models trained with KD.
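To make "transfer based on functional activity" concrete, the sketch below computes one common distillation objective: the KL divergence between temperature-softened teacher and student outputs. The abstract does not state which loss HARD uses, so the loss, the function names (`log_softmax`, `kd_loss`), and the toy logits are all assumptions chosen for illustration.

```python
import numpy as np

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def kd_loss(teacher_logits, student_logits, temperature=1.0):
    """Mean KL(teacher || student) over a batch of logit vectors.

    Illustrative only: one common KD objective, not necessarily
    the one used in the paper.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl_per_example = np.sum(np.exp(log_p) * (log_p - log_q), axis=-1)
    return float(np.mean(kl_per_example))

# Hypothetical toy logits for a single input and three classes.
teacher = np.array([[2.0, 1.0, 0.1]])
student = np.array([[0.5, 1.5, 0.2]])

loss_same = kd_loss(teacher, teacher)   # identical outputs: no disagreement
loss_diff = kd_loss(teacher, student)   # differing outputs: positive loss
```

A quantity of this kind is large exactly where teacher and student disagree, which is the sort of signal an augmentation scheme could use to search for informative inputs.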
DataJoint: managing big scientific data using MATLAB or Python
The rise of big data in modern research poses serious challenges for data management: Large and intricate datasets from diverse instrumentation must be precisely aligned, annotated, and processed in a variety of ways to extract new insights. While high levels of data integrity are expected, research teams have diverse backgrounds, are geographically dispersed, and rarely possess a primary interest in data science. Here we describe DataJoint, an open-source toolbox designed for manipulating and processing scientific data under the relational data model. Designed for scientists who need a flexible and expressive database language with few basic concepts and operations, DataJoint facilitates multi-user access, efficient queries, and distributed computing. With implementations in both MATLAB and Python, DataJoint is not limited to particular file formats, acquisition systems, or data modalities and can be quickly adapted to new experimental designs. DataJoint and related resources are available at http://datajoint.github.com
Natural Image Coding in V1: How Much Use is Orientation Selectivity?
Orientation selectivity is the most striking feature of simple cell coding in
V1, which has been shown to emerge from the reduction of higher-order
correlations in natural images in a large variety of statistical image models.
The most parsimonious one among these models is linear Independent Component
Analysis (ICA), whereas second-order decorrelation transformations such as
Principal Component Analysis (PCA) do not yield oriented filters. Because of
this finding, it has been suggested that the emergence of orientation
selectivity may be explained by higher-order redundancy reduction. To assess
the tenability of this hypothesis, an important empirical question is how much
more redundancy can be removed with ICA than with PCA or other second-order
decorrelation methods. This question has not yet been settled: over the last
ten years, contradictory results have been reported, ranging from less than
five to more than a hundred percent extra gain for ICA.
Here, we aim at resolving this conflict by presenting a very careful and
comprehensive analysis using three evaluation criteria related to redundancy
reduction: In addition to the multi-information and the average log-loss we
compute, for the first time, complete rate-distortion curves for ICA in
comparison with PCA. Without exception, we find that the advantage of the ICA
filters is surprisingly small. Furthermore, we show that a simple spherically
symmetric distribution with only two parameters can fit the data even better
than the probabilistic model underlying ICA. Since spherically symmetric models
are agnostic with respect to the specific filter shapes, we conclude that
orientation selectivity is unlikely to play a critical role for redundancy
reduction.
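As a small illustration of what "second-order decorrelation" means here, the sketch below applies PCA whitening to correlated non-Gaussian data and checks that the resulting covariance is the identity. The synthetic Laplace data, the mixing matrix, and the helper name `pca_whiten` are assumptions made for this example; it does not reproduce the paper's analysis of natural image patches.

```python
import numpy as np

def pca_whiten(x):
    """PCA-whiten data whose rows are samples.

    Returns the whitened data and the whitening matrix W = V diag(1/sqrt(lambda)),
    where V and lambda are the eigenvectors and eigenvalues of the covariance.
    """
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitening = eigvecs / np.sqrt(eigvals)  # scales column j by 1/sqrt(eigvals[j])
    return centered @ whitening, whitening

# Hypothetical toy data: independent Laplace sources, linearly mixed.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
data = rng.laplace(size=(5000, 2)) @ mixing.T

white, _ = pca_whiten(data)
white_cov = np.cov(white, rowvar=False)  # identity up to floating-point error
```

Whitening removes all second-order correlations, but any rotation of the whitened data is equally decorrelated. Higher-order methods such as ICA select among these rotations, and the abstract's question is how much additional redundancy that selection actually removes.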